The Doubly Regularized Support Vector Machine
Authors
Abstract
The standard L2-norm support vector machine (SVM) is a widely used tool for classification problems. The L1-norm SVM is a variant of the standard L2-norm SVM that constrains the L1-norm of the fitted coefficients. Due to the nature of the L1-norm, the L1-norm SVM selects variables automatically, a property not shared by the standard L2-norm SVM. It has been argued that the L1-norm SVM may have some advantage over the L2-norm SVM, especially in high-dimensional problems and when there are redundant noise variables. On the other hand, the L1-norm SVM has two drawbacks: (1) when there are several highly correlated variables, the L1-norm SVM tends to pick only a few of them and remove the rest; (2) the number of selected variables is upper bounded by the size of the training data. A typical example where both occur is gene microarray analysis. In this paper, we propose a doubly regularized support vector machine (DrSVM). The DrSVM uses the elastic-net penalty, a mixture of the L2-norm and the L1-norm penalties. By doing so, the DrSVM performs automatic variable selection in a way similar to the L1-norm SVM. In addition, the DrSVM encourages highly correlated variables to be selected (or removed) together. We illustrate how the DrSVM can be particularly useful when the number of variables is much larger than the size of the training data (p ≫ n). We also develop efficient algorithms to compute the whole solution paths of the DrSVM.
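The abstract describes the DrSVM objective only in words, so a small numerical sketch may help. The code below assumes the usual way of writing a hinge loss with an elastic-net penalty, sum_i [1 - y_i (beta0 + x_i' beta)]_+ + lam1 * ||beta||_1 + (lam2 / 2) * ||beta||_2^2. The names `drsvm_objective`, `soft_threshold`, `fit_drsvm`, `lam1`, `lam2`, and `step` are illustrative and do not come from the paper. The solver is a plain proximal subgradient loop chosen for brevity; it is not the solution-path algorithm the abstract refers to.

```python
import numpy as np


def drsvm_objective(beta, beta0, X, y, lam1, lam2):
    """Hinge loss plus elastic-net penalty (assumed form, see lead-in)."""
    margins = y * (X @ beta + beta0)
    hinge = np.maximum(0.0, 1.0 - margins).sum()
    l1_penalty = lam1 * np.abs(beta).sum()
    l2_penalty = 0.5 * lam2 * (beta @ beta)
    return hinge + l1_penalty + l2_penalty


def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1: shrink each entry toward zero by t."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def fit_drsvm(X, y, lam1, lam2, step=0.01, n_iter=2000):
    """Illustrative proximal subgradient solver (not the paper's path algorithm).

    X: (n, p) array of inputs; y: (n,) array of labels in {-1, +1}.
    """
    n, p = X.shape
    beta = np.zeros(p)
    beta0 = 0.0
    for k in range(n_iter):
        margins = y * (X @ beta + beta0)
        active = margins < 1.0  # points that currently incur hinge loss
        # Subgradient of the hinge term plus gradient of the L2 term.
        g_beta = -(X[active].T @ y[active]) + lam2 * beta
        g_beta0 = -y[active].sum()
        eta = step / np.sqrt(k + 1)  # diminishing step size
        # The L1 term is handled by soft-thresholding, which yields exact zeros.
        beta = soft_threshold(beta - eta * g_beta, eta * lam1)
        beta0 -= eta * g_beta0
    return beta, beta0
```

Calling `fit_drsvm(X, y, lam1, lam2)` on a labelled data set returns a coefficient vector and an intercept. In this sketch, larger `lam1` values drive more coefficients to exactly zero, while `lam2` adds the ridge-type shrinkage that the abstract associates with selecting or removing correlated variables together.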
Similar resources
Efficient variable selection in support vector machines via the alternating direction method of multipliers
The support vector machine (SVM) is a widely used tool for classification. Although commonly understood as a method of finding the maximum-margin hyperplane, it can also be formulated as a regularized function estimation problem, corresponding to a hinge loss function plus an L2-norm regularization term. The doubly regularized support vector machine (DrSVM) is a variant of the standard SVM, which i...
A robust regularization path for the Doubly Regularized Support Vector Machine
The Doubly Regularized SVM (DrSVM) is an extension of the SVM that uses a mixture of L2-norm and L1-norm penalties. This kind of penalty, sometimes referred to as the elastic net, performs variable selection while taking correlations between variables into account. Introduced by Wang [1], an efficient algorithm to compute the whole DrSVM solution path has been proposed. Unfortunately, in some cases, thi...
Selecting Parameters of an Improved Doubly Regularized Support Vector Machine based on Chaotic Particle Swarm Optimization Algorithm
Taking full advantage of the L1-norm support vector machine and the L2-norm support vector machine, a new improved doubly regularized support vector machine is proposed to analyze datasets with small samples, high dimensions, and high correlations among some of the variables. A kind of smooth function is used to approximately overcome the non-differentiability of the L1-norm and the ste...
A Combined Vector and Direct Power Control for AC/DC/AC Converters in DFIG Based Wind Turbine
Doubly-fed induction generators (DFIG) are clearly superior for large-capacity applications with limited-range speed control, owing to their partially rated inverter, lower cost, and high reliability. These characteristics give the doubly-fed wound-rotor induction machine wide application in wind-driven generation. In this paper Combined Vector and direct power control (CVDPC) strate...
Fault diagnosis in a distillation column using a support vector machine based classifier
Fault diagnosis has always been an essential aspect of control system design, driven by the growing demand for increased performance and safety of industrial systems. The support vector machine classifier is a new technique based on statistical learning theory and is designed to reduce structural bias. Support vector machine classification in many applications in v...